Fitting binary regression models with case-augmented samples
Abstract
In a case-augmented study, measurements on a random sample from a population are augmented by information from an independent sample of cases, that is, units with some characteristic of interest. We show that inferences about the effect of the covariates on the probability of being a case can be made by fitting a modified prospective likelihood. We also show that this procedure is fully efficient.

1. Introduction

This paper is concerned with the problem of fitting a regression model,

$$\mathrm{pr}(Y = 1 \mid x) = p(x; b) \tag{1}$$

say, relating the mean of a binary response variable, $Y$, to a p-dimensional vector of covariates, $x$, in situations where observations from a random sample drawn from the whole population are augmented by data from an independent sample of cases, i.e. units with $Y = 1$.

There are two distinct situations to be considered. In the first, Design 1, we have information about the covariate values, $x$, but not the responses, $Y$, for individuals in the original random sample; we shall call this the reference sample. In the second, Design 2, we have information about both $Y$ and $x$ for members of the reference sample. Cosslett (1981) uses the term case-supplemented sampling to describe the first situation and case-enriched sampling for the second. We shall use these terms, and the general term case-augmented sampling to cover both situations.

Both designs can be regarded as variants of the standard, unmatched case-control design, except that here controls are drawn at random from the whole population rather than from the noncases, i.e. units with $Y = 0$.
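To make the two designs concrete, the following is a minimal simulation sketch, not the authors' procedure. It assumes a single normally distributed covariate and a logistic form for p(x; b); the function name `simulate_case_augmented` and the parameter values `b0` and `b1` are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_case_augmented(n_ref, n_case, b0=-2.0, b1=1.0, seed=0):
    """Draw a reference sample from the whole population and an
    independent sample of cases (units with Y = 1)."""
    rng = random.Random(seed)

    def draw_unit():
        # Assumption: one standard normal covariate and a logistic p(x; b).
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        y = 1 if rng.random() < p else 0
        return x, y

    # Reference sample: a random sample from the whole population.
    reference = [draw_unit() for _ in range(n_ref)]

    # Case sample: keep drawing from the population, retaining only cases.
    case_x = []
    while len(case_x) < n_case:
        x, y = draw_unit()
        if y == 1:
            case_x.append(x)

    # Design 1 (case-supplemented): Y is not observed in the reference sample.
    design1 = {"reference_x": [x for x, _ in reference], "case_x": case_x}
    # Design 2 (case-enriched): both Y and x are observed in the reference sample.
    design2 = {"reference_xy": reference, "case_x": case_x}
    return design1, design2
```

The only difference between the two returned structures is whether the reference sample carries the response, which mirrors the distinction drawn in the text.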
Obviously there is very little difference between the designs if cases are rare in the population, but one of the advantages of the case-augmented designs is that they allow us to estimate relative risks without invoking the 'rare disease' assumption, for, with a simple application of Bayes' Theorem, we can write the risk at covariate value $x$ relative to that at some baseline, $x_0$, as

$$\frac{\mathrm{pr}(Y = 1 \mid x)}{\mathrm{pr}(Y = 1 \mid x_0)} = \frac{g(x \mid Y = 1)}{g(x_0 \mid Y = 1)} \bigg/ \frac{g(x)}{g(x_0)},$$
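As an illustration of the Bayes' Theorem identity above, here is a small sketch for a discrete covariate. It assumes that g(x | Y = 1) can be estimated by the empirical frequency of x in the case sample and g(x) by its frequency in the reference sample. The function name `relative_risk` is hypothetical, and this plug-in estimate is not the modified prospective likelihood discussed in the abstract.

```python
from collections import Counter


def relative_risk(case_x, reference_x, x, x0):
    """Plug-in estimate of pr(Y=1|x) / pr(Y=1|x0) for a discrete covariate."""
    case_freq = Counter(case_x)
    ref_freq = Counter(reference_x)

    # Every frequency that appears in a denominator must be positive.
    if case_freq[x0] == 0 or ref_freq[x] == 0 or ref_freq[x0] == 0:
        raise ValueError("x and x0 must be observed in the required samples")

    # Ratio of covariate frequencies among cases: g(x|Y=1) / g(x0|Y=1).
    case_ratio = case_freq[x] / case_freq[x0]
    # Ratio of covariate frequencies in the population: g(x) / g(x0).
    ref_ratio = ref_freq[x] / ref_freq[x0]
    return case_ratio / ref_ratio


# Toy data: x = 1 is three times as common as x = 0 among cases,
# but one third as common in the reference sample.
rr = relative_risk(case_x=[1, 1, 1, 0], reference_x=[1, 0, 0, 0], x=1, x0=0)
```

Because the sample sizes cancel within each ratio, only the counts at x and x0 are needed, and no estimate of the overall case prevalence enters the calculation.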
Similar resources
The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data
The goal of this study is to introduce a statistical method regarding the analysis of specific latent data for regression analysis of the discrete data and to build a relation between a probit regression model (related to the discrete response) and normal linear regression model (related to the latent data of continuous response). This method provides precise inferences on binary and multinomia...
Maximum Expected F-Measure Training of Logistic Regression Models
We consider the problem of training logistic regression models for binary classification in information extraction and information retrieval tasks. Fitting probabilistic models for use with such tasks should take into account the demands of the task-specific utility function, in this case the well-known F-measure, which combines recall and precision into a global measure of utility. We develop a...
Binary Regression With a Misclassified Response Variable in Diabetes Data
Objectives: The categorical data analysis is very important in statistics and medical sciences. When the binary response variable is misclassified, the results of fitting the model will be biased in estimating adjusted odds ratios. The present study aimed to use a method to detect and correct misclassification error in the response variable of Type 2 Diabetes Mellitus (T2DM), applying binary ...
New Approach in Fitting Linear Regression Models with the Aim of Improving Accuracy and Power
The main contribution of this work lies in challenging the common practice of inferential statistics in the realm of simple linear regression for attaining a higher degree of accuracy when multiple observations are available, at least, at one level of the regressor variable. We derive sufficient conditions under which one can improve the accuracy of the interval estimations at quite affordable ...
Locally adaptive function estimation for binary regression models.
In this paper we present a nonparametric Bayesian approach for fitting unsmooth or highly oscillating functions in regression models with binary responses. The approach extends previous work by Lang et al. for Gaussian responses. Nonlinear functions are modelled by first or second order random walk priors with locally varying variances or smoothing parameters. Estimation is fully Bayesian and u...
Publication date: 2006